-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(agent): Win32 pseudo-console-based agent launcher executable #378
Conversation
when building the MSIX
That allows enforcing the final location in the package Both agent and launcher will stay in the same directory.
GUI applications rely on some sort of event loop to mediate the unpredictable timing user input will happen. The Windows OS models this by means of the window message loop and the WndProc function that says how the window responds to messages it receives. We want to wait on ReadFile to drain the console pipes, so the child process won't block on the next printf after the pipe becomes full. We also want to wait on the agent process so we can quit if it quits. We also want to respond to messages sent by the OS (such as WM_QUIT). Threads without actual windows still have the message loop. We can combine the window approach with some waiting functions and create a custom message loop that allows our thread to sleep while we don't receive any relevant input neither on the thread queue, nor on the console pipe or the agent process termination. Creating the pseudo-console should be declarative. Starting the agent process under the pseudo-console should be a function call. Reacting to the agent's process termination or new input from the console pipe should resemble reactive programming by passing functions. I don't want to see tons of Win32 in my main() I hope you enjoy this small spec. :)
Leveraging FormatMessage function to generate a string message out of the system's catalog, based on the HRESULT value. Plus some sugar with std::source_location. That will come handy in avoiding freeing memory and closing handles in multiple code paths. A word on `unique_string`: Because FormatMessage allocates a buffer and expects the caller to later free it using Windows Heap functions this small wrapper based on std::unique_ptr makes it very handy to deal with such confusing allocate/deallocate scenarios. Logging: we don't do it everywhere, this is a very tiny application. But if an exception occur, than we should. Thus, LogSingleShot(): opens a file, writes to it and closes it. In a single shot. Not the best performance, but meant to be called just once, before quitting.
Destructor is just a bunch of checked CloseHandle calls and the most important ClosePseudoConsole call. Construction is more interesting: creates anonymous pipes to serve as the console stdio HANDLEs. The console geometry seems irrelevant for our use case, but I opted by passing it for further exploration.
By far the most complicated part of this feature. Starting a child process under a pseudo console requires: 1. a STARTUPINFOEX structure filled with the handles the pseudo console uses for stdio 2. a LPPROC_THREAD_ATTRIBUTE_LIST containing the key PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE mapped to the HPCON (pseudo-console device handle). This list has to be allocated in the heap, since the size of its elements is hidden from the application code. We need to try to initialize it first to fail deterministically, just to learn how many bytes needs to be allocated. Then allocate the required amount of memory and try again, which should succeed. Then we fill the key-value pair we desire by calling UpdateProcThreadAttribute. Notice that's not related to the current process or thread, but for the one we'll start. 3. Then we CreateProcessW 4. Care must be taken to avoid leaking the process/thread attribute list. Disposing it requires calling DeleteProcThreadAttributeList followed by freeing the memory. Again, the unique_ptr with custom deleter idiom comes in handy. This is scoped to the process creation, once done we no longer need such structure. Since this program is meant to start only one process and live in sync with it, we don't need to either cache that thing nor make it visible outside of this routine.
Although we never create or show a window, this is still a GUI application Thus it's powered by the window/thread message loop I mentioned before. The need for the event loop might still not be clear enough. Let me detail it: 1. Pseudo-console stdio HANDLEs are anonymous pipes HANDLEs. 2. The child process inherits the parts it needs to implement stdio. 3. The parent (this program) holds the other end of the pipes. 4. If stdout pipe becomes full, the child process will block at the next attempt to printf. 5. Thus we need to periodically read from the pipe to drain it. 6. We can do whatever we want with the bytes we read, nothing inclusive. 7. But we must read it. 8. Calling ReadFile on an empty pipe blocks the caller until new data arrives. 9. The child process might quit without writing anything new. 10. The parent will wait forever. 11. Users could kill this program (via task manager, for example) 12. That causes the process to receive WM_QUIT (amongst other related messages) 13. We need to respond to that message (and others maybe). A lot of asynchrony. MsgWaitForMultipleObjectsEx to the rescue. Seems made for this problem. We can wait on multiple HANDLEs that the OS can interpret as synchronizeable, i.e. that has some state that can change meaningfully, such as a process that terminates or a pipe that receives new data. Not to be confused with Overlapped IO. That's a whole other world. It also waits on new messages arriving to the window or thread message queue. If configured accordingly it blocks until one of the interesting activities happen, letting us know which one, so we can handle it accordingly. The association between which HANDLE to watch and which behavior to adopt could be hardcoded in the event loop itself, but I found so beautiful it passed as a parameter at the event loop construction. Otherwise the event loop would be part of the PseudoConsole class, making it like a God.
As the fulltrust process and startup task in the appxmanifest.
instead of the real agent, which is now a CLI application Also updates test code that builds the agent The GUI expects the launcher now.
d85e754
to
8d8b22a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments, but good stuff in general.
Regarding the pipes: can't we simply pass the same STDIO from the parent process without pipes? To reduce complexity.
There is no STDIO for a Windows application unless we create a console by calling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎆 Great!
squashing commits to keep the history clean. Deletes copy constructor and operator from Process Consequently, applies rule-of-five. And springle some noexcepts in the special member functions. Add a catch(...) clause log path directory to always under LOCALAPPDATA Thus, renaming MakePathRelativeToEnvDir to UnderLocalAppDataPath Throws an exception if the LOCALAPPDATA path is too big to fit in a MAX_PATH sized buffer. Uses at instead of [] to index vectors The at method is checked and throw exceptions if out-of-bounds. Avoid overwrite previous logs operator= must return a value ;) Plus some pedantic header inclusion
461742f
to
f5e7183
Compare
Not sure whether this should be marked as feature or bugfix. The aim is to prevent WSL API to cause terminal windows popping up when the agent uses it to take an action inside a WSL instance. That happened because WSL API launches console processes, so if the parent is not a console process, a new one is created. There is no public API to control that behavior.
So the solution is to compile the agent as a console application, so it hosts the WSL API processes. To make it still invisible, another layer of indirection is added: a native Win32 window process is created, but instead of showing a window, it creates a pseudo-console device and hosts the agent under it, as if the launcher was a skinny version of
conhost.exe
. Being a pseudo-console, we can fully control how it's presented to the user, or not presented at all, as done in this PR.The logic behind creating a pseudo-console device and using it to host child processes is quite convoluted, in my opinion, lots of possible error paths requiring freeing resources, thus I made it more C++-ish, with RAII like semantics, either by leveraing classes or being creative with
std::unique_ptr
.For more information about this thingy, please check out:
https://learn.microsoft.com/en-us/windows/console/creating-a-pseudoconsole-session
and
https://devblogs.microsoft.com/commandline/windows-command-line-introducing-the-windows-pseudo-console-conpty/
The GUI, its tests and the appxmanifest were also updated to reflect that the "agent's background process" must be the launcher.
A nice side effect is that, if the launcher is killed, let's say by the task manager, the agent quits cleanly, deleting the
addr
file as if it receivedCtrl-C
. That was not (and still is not) true for killing the agent itself, it has no chance to react.